The Business Question

"They don't make 'em like they used to." You've heard it before. Maybe you've even said it. When you scroll through Netflix and see Casablanca with an 8.5 rating next to a new release with a 6.2, it's easy to believe that movies have gotten worse over time.

But is this actually true? Or are we falling victim to a statistical illusion?

Testable Hypotheses

  • H1 (Initial Observation): Older movies have higher average ratings than recent movies
  • H2 (Survivorship Bias): This effect is due to selective preservation - only good old movies survived
  • H3 (Niche Inflation): Vote volume affects perceived quality - obscure old films are rated by superfans

The Data

We analyzed 315,794 movies from IMDb spanning 1920-2023, including ratings from millions of users. Let's see what the data actually tells us.
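A dataset like this is typically assembled by joining IMDb's public dumps (title.basics for title type and year, title.ratings for scores and vote counts) on the shared `tconst` key. A minimal sketch of that join, using tiny stand-in frames rather than the real TSV files:

```python
import pandas as pd

# Toy stand-ins for IMDb's public dumps (title.basics.tsv and
# title.ratings.tsv); the real files are merged the same way.
basics = pd.DataFrame({
    "tconst": ["tt001", "tt002", "tt003"],
    "titleType": ["movie", "movie", "tvSeries"],
    "startYear": [1942, 2023, 2010],
})
ratings = pd.DataFrame({
    "tconst": ["tt001", "tt002", "tt003"],
    "averageRating": [8.5, 6.2, 7.9],
    "numVotes": [600_000, 45_000, 120_000],
})

# Join ratings onto titles, then keep feature films in our window.
movies = basics.merge(ratings, on="tconst")
movies = movies[(movies["titleType"] == "movie")
                & movies["startYear"].between(1920, 2023)]
print(movies[["tconst", "startYear", "averageRating"]])
```

The TV series is dropped by the `titleType` filter, leaving only the two feature films.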

Figure 1.1: Are movies getting worse? This interactive bar chart displays the average IMDb movie rating for each decade from the 1900s to the 2020s. At first glance, the blue bars are significantly higher for the 1920s, 30s, 40s, and 50s. This visual trend represents the commonly held belief in a "Golden Age" of cinema, where older movies are perceived as superior. But is this true quality, or are we just looking at the statistics incorrectly? This analysis will investigate that mystery.

Data Validation & Quality Checks

Before we test our hypotheses, we need to understand our data. What are we actually measuring? What assumptions are we making?

Key Assumptions

  • Time Range: 1920-2023 (excludes very early cinema with sparse data)
  • Minimum Sample Size: Decades with fewer than 50 movies are excluded
  • Title Type: Only feature films (movies), excluding TV shows and shorts
  • Rating Source: IMDb user ratings (crowd-sourced, not critic reviews)
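The assumptions above translate directly into a grouping rule: bucket each year into a decade, then drop decades below the sample-size floor. A sketch with toy numbers (the real threshold is 50 movies; the column names are assumptions):

```python
import pandas as pd

# Hypothetical frame with the columns named in the assumptions above.
movies = pd.DataFrame({
    "startYear": [1925, 1927, 1955, 2015, 2016, 2017],
    "averageRating": [7.9, 8.1, 7.5, 6.4, 6.8, 6.1],
})

# Bucket years into decades (1925 -> 1920, 2016 -> 2010, etc.).
movies["decade"] = (movies["startYear"] // 10) * 10

# Average rating per decade, keeping only decades with enough movies
# (the real threshold is 50; 2 here so the toy data survives).
per_decade = movies.groupby("decade")["averageRating"].agg(["mean", "count"])
per_decade = per_decade[per_decade["count"] >= 2]
print(per_decade)
```

Here the lone 1955 film falls below the floor, so the 1950s decade is excluded from the averages.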

Figure 2.1: Checking our data. Before we start drawing conclusions, we need to trust our data. This chart checks what percentage of movies in our dataset actually have the necessary information: a user rating and a vote count. As the consistent height of the bars shows, over 95% of movies from every single decade have complete data. This high level of completeness means we don't have to worry about missing information skewing our results—our dataset is solid.

Figure 2.2: The explosion of movies. Here we are counting the total number of movies released in each decade. The result is shocking: the film industry has grown exponentially. The bar for the 2010s towers over the rest, showing that more movies were released in that one decade than in the first 80 years of cinema combined. This "explosion" means we are comparing a tiny handful of older classics against a massive flood of modern content.

Figure 2.3: Popularity contest. This histogram shows the distribution of "popularity" as measured by the number of votes a movie receives. The tall bar on the far left shows that the vast majority of movies receive very few votes—often fewer than 1,000. Meanwhile, the long tail to the right represents the rare blockbusters that get hundreds of thousands of votes. This extreme unevenness is crucial: most "movies" are obscure titles that almost no one has seen, while our perception of quality is often driven by the few popular hits.

Data Quality Summary

✓ High completeness across all decades
✓ Large sample sizes (especially recent decades)
⚠ Vote counts vary dramatically (will need to account for this)
⚠ Sample size imbalance across decades (more recent = more movies)

Testing Hypothesis 2: Survivorship Bias

Here's the key insight: we're comparing apples to oranges.

For the 1950s, we mostly have data on movies that were good enough to be remembered, preserved, and rated by modern audiences. The terrible B-movies from 1954? Most are lost to time or have fewer than 100 votes on IMDb.

But for 2023? We have everything. Every straight-to-streaming flop, every low-budget indie that nobody watched. We're comparing the greatest hits of 1950 to the complete filmography of 2023.

The Experiment

What if we only look at popular movies from every decade? Movies with at least 1,000 or 10,000 votes? This filters out the forgotten garbage from all eras, not just recent ones.
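The experiment reduces to one extra filter before averaging. A sketch with illustrative numbers (not the real dataset): an old decade where only hits survive, versus a new decade that also contains obscure, poorly rated titles.

```python
import pandas as pd

# Toy data: an old decade where only hits survive, a new decade with
# both hits and obscure flops (columns are assumptions, not the real schema).
movies = pd.DataFrame({
    "decade":        [1950, 1950, 2010, 2010, 2010, 2010],
    "averageRating": [8.0,  7.8,  7.9,  7.7,  4.0,  4.5],
    "numVotes":      [50_000, 20_000, 60_000, 30_000, 300, 150],
})

raw = movies.groupby("decade")["averageRating"].mean()
popular = (movies[movies["numVotes"] >= 1_000]
           .groupby("decade")["averageRating"].mean())

# The raw gap between decades is large; filtering to popular
# titles nearly closes it.
print(raw[1950] - raw[2010], popular[1950] - popular[2010])
```

In the toy data the raw gap is nearly two points; after the 1,000-vote filter it shrinks to a tenth of a point, mirroring the flattening seen in Figure 3.1.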

Figure 3.1: Comparing apples to apples. This side-by-side comparison reveals the first crack in the myth. The left chart includes *every* movie, showing the familiar "Old is Better" trend. But the right chart includes *only* popular movies—those with at least 1,000 votes. Look at what happens: the trend lines flatten out completely. When we stop comparing obscure amateur films against famous classics and instead compare popular movies against popular movies, the modern era holds its own. The "Golden Age" effectively vanishes.

Finding: Hypothesis 2 Confirmed ✓

The apparent superiority of old movies is largely due to survivorship bias. When we filter by popularity (a proxy for "movies people actually remember"), the gap shrinks dramatically.

The 1950s weren't better at making movies. We just forgot about all the bad ones.

Testing Hypothesis 3: Niche Inflation

There's another problem with the raw averages: who's doing the rating?

Consider an obscure 1970s arthouse film with 500 votes. Who's rating it? Film school students, hardcore cinephiles, people who specifically sought it out. They're giving it 9s and 10s because they love that kind of movie.

Now consider a new Marvel movie with 500,000 votes. Who's rating it? Everyone. Your grandma, random teenagers, people who hate superhero movies but watched it anyway. The average is pulled down by casual viewers.

The Solution: Vote Weighting

We can correct for this by weighting each movie's rating by its vote count (using a log transformation to prevent blockbusters from dominating). This gives more influence to widely seen movies and less to niche favorites.
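One way to implement this, sketched with two hypothetical films (a niche favorite and a widely seen release; the exact weighting scheme in the analysis may differ):

```python
import numpy as np
import pandas as pd

# Hypothetical decade slice: one niche film rated by superfans,
# one widely seen film rated by the general public.
films = pd.DataFrame({
    "averageRating": [9.0, 6.5],
    "numVotes":      [500, 500_000],
})

# Weight each rating by log(votes): popular films count more,
# but a blockbuster with 1000x the votes gets ~2x the weight,
# not 1000x, so it cannot dominate outright.
w = np.log10(films["numVotes"])
raw_mean = films["averageRating"].mean()
weighted_mean = (films["averageRating"] * w).sum() / w.sum()

print(raw_mean, weighted_mean)
```

The weighted mean lands closer to the widely seen film's 6.5 than the raw mean does, which is exactly the pull-down visible in the pink line of Figure 4.1.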

Figure 4.1: The Superfan Effect. This chart introduces a "Vote Weighting" adjustment. The blue line is the raw average, while the pink line gives more weight to movies with more votes. Notice how the pink line drops significantly for the older decades? This suggests that many obscure older films have inflated ratings because they are rated only by small groups of dedicated fans or cinephiles, not the general public. When we force these niche favorites to compete on popularity, their average score drops, further debunking the idea of a superior past.

Finding: Hypothesis 3 Confirmed ✓

Niche inflation is real. Obscure old movies benefit from being rated primarily by enthusiasts, while popular modern movies face harsher crowds. Vote-weighting corrects this bias.

Combined with survivorship bias, these two effects explain most of the "golden age" illusion.

The Content Explosion & Variance

Even if average quality hasn't changed, something else definitely has: volume.

In the 1950s, you needed a studio, expensive equipment, and distribution deals to make a movie. Today? You need a camera and YouTube. The barriers to entry have collapsed.

Figure 5.1: A flood of content. This line chart tracks the cumulative number of movies over time, emphasizing the sheer scale of modern production. We are literally drowning in content. The steeper the slope, the faster the library of human cinema is growing. This massive volume of modern movies makes it harder to find the gems, creating the *feeling* that quality has dropped simply because there is so much more "noise" to sift through.

Figure 5.2: The gap widens. With more movies comes more variety in quality. This chart visualizes the "spread" or variance of ratings. In recent decades, the gap between the best and worst movies has widened enormously. We are producing more terrible movies than ever before, but we are also producing just as many great ones. The "Golden Age" illusion comes from survivorship bias: time filters out the garbage from the past, leaving only the classics, while today we face the full unfiltered spectrum of quality every time we open a streaming app.
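The "spread" in Figure 5.2 is just the per-decade standard deviation of ratings. A sketch with illustrative numbers: a time-filtered old decade whose survivors cluster tightly, against a modern decade spanning the full spectrum.

```python
import pandas as pd

# Toy ratings: a "filtered" old decade vs. an "everything" modern decade
# (illustrative numbers, not the real dataset).
movies = pd.DataFrame({
    "decade": [1950] * 4 + [2010] * 4,
    "averageRating": [7.5, 7.8, 8.0, 7.6,   # survivors cluster tightly
                      9.0, 7.5, 3.0, 5.5],  # full spectrum, good and bad
})

# Standard deviation of ratings within each decade.
spread = movies.groupby("decade")["averageRating"].std()
print(spread)
```

In the toy data the modern decade's spread is an order of magnitude larger, even though its best film outscores everything from the older decade.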

Key Insight

The problem isn't that quality has declined. The problem is that there's too much stuff.

In 1950, if you went to the movies, you had maybe 10-20 options. Today? Thousands. The masterpieces are still being made - they're just harder to find in the noise.

Conclusions & Decisions

What We Learned

  • H1 (Initial Observation): ✓ Confirmed - Raw data shows older movies have higher ratings
  • H2 (Survivorship Bias): ✓ Confirmed - This is largely due to selective preservation
  • H3 (Niche Inflation): ✓ Confirmed - Vote volume significantly affects perceived quality

Verdict: The "golden age" wasn't actually better. It was just filtered.

Figure 6.1: The big picture. This final visualization brings all our adjustments together. It shows how the "raw" rating trend (Blue) gradually flattens out as we apply our corrections for popularity (Orange) and sample size (Green). By the time we account for all the statistical quirks—survivorship bias, the superfan effect, and the explosion of content—the line becomes almost flat. The conclusion? Movies haven't gotten worse. We've just gotten better at counting them, and there are a lot more of them.

Decision Framework

For Viewers: How to Find Quality Content

  • Don't trust decade-based generalizations - quality exists in every era
  • Look for movies with substantial vote counts (>1,000) to avoid niche inflation
  • Use filters and curated lists to cut through the noise
  • Remember: more options = harder to choose, but gems are still there

For Creators: Quality Standards Haven't Changed

  • Great storytelling works in any decade - the fundamentals are timeless
  • You're competing with more content, not higher standards
  • Focus on standing out in a crowded market, not matching some mythical past

For Platforms: Curation vs. Volume Trade-offs

  • More content increases variance - curation becomes more valuable
  • Recommendation algorithms must account for vote volume bias
  • Consider surfacing older content that survived the filter (it's probably good)
  • Help users navigate abundance, don't just add more

Next Experiments: What to check if we had one more week

  • Genre-specific analysis: Has quality changed differently for action vs. drama?
  • International comparison: Is this effect global or US-centric?
  • Critic vs. user ratings: Do professional reviewers show the same bias?
  • Temporal analysis: When exactly did the content explosion begin?
  • Budget correlation: Does production cost predict quality across decades?

Final Thought

The next time someone says "they don't make 'em like they used to," you can confidently reply: "Actually, they do. We just forgot about all the bad ones from back then."